FIX+ENH: Time conversions for CSV parser and HTML table parser #2329

Merged: 1 commit into certtools:develop from fix-csv-parser, May 19, 2023

Conversation

gethvi
Contributor

@gethvi gethvi commented Mar 8, 2023

This PR:

  • Adds a new TimeFormat class for the time_format bot parameter. It improves performance because the parameter is validated only once, when the bot class is instantiated, and not every time a datetime is parsed (looking at you, HTML Table parser). It also removes some duplicated code.
  • Changes the CSV parser's time conversions. For some reason the CSV parser had its own TIME_CONVERSIONS, which was very limited (e.g. from_format couldn't be used, see "Having issues with intelmq.bots.collectors.file.collector_file", #2326). This PR changes it to use DateTime.TIME_CONVERSIONS. The CSV parser now uses the TimeFormat class for the time_format parameter.
  • Changes the HTML Table parser to use the TimeFormat class for the time_format parameter as well.
  • Renames the DateTime conversion functions to a consistent naming scheme starting with from_ and makes their signatures consistent. Backwards compatible.
  • Updates some docstrings.
  • Fixes a bug in InvalidArgument exception.
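
To illustrate the "validate once, convert many times" idea from the first bullet, here is a minimal sketch. It is not the code from this PR: `TimeFormatSketch`, the `CONVERSIONS` table, and the `name|argument` syntax for the parameter value are assumptions made for illustration only.

```python
from datetime import datetime, timezone
from functools import partial
from typing import Callable, Dict


def from_timestamp(value: str) -> str:
    """Convert a Unix timestamp in seconds to an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(float(value), tz=timezone.utc).isoformat()


def from_format(value: str, format: str) -> str:
    """Parse value with a strptime format; naive results are treated as UTC."""
    parsed = datetime.strptime(value, format)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.isoformat()


# Hypothetical stand-in for a shared conversion table (assumption).
CONVERSIONS: Dict[str, Callable[..., str]] = {
    "from_timestamp": from_timestamp,
    "from_format": from_format,
}


class TimeFormatSketch:
    """Validate a time_format value once and keep a ready-to-call converter."""

    def __init__(self, value: str):
        # Assumed syntax: "<conversion name>" or "<conversion name>|<argument>".
        name, _, argument = value.partition("|")
        if name not in CONVERSIONS:
            raise ValueError(f"Unknown time conversion: {name!r}")
        if name == "from_format":
            if not argument:
                raise ValueError("from_format requires a format string")
            # Bind the format now so parse() does no further validation.
            self.convert = partial(CONVERSIONS[name], format=argument)
        elif argument:
            raise ValueError(f"{name} does not take an argument")
        else:
            self.convert = CONVERSIONS[name]

    def parse(self, value: str) -> str:
        """Convert one raw value using the pre-validated conversion."""
        return self.convert(value)
```

In this sketch a parser would build the object once during initialisation and then call `parse()` for every row or table cell, so a misconfigured parameter fails at start-up and not in the middle of processing.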

@gethvi gethvi force-pushed the fix-csv-parser branch 2 times, most recently from d3ff5ab to 3c2d5f6 Compare March 8, 2023 14:53
@gethvi gethvi marked this pull request as draft March 9, 2023 09:55
@gethvi gethvi force-pushed the fix-csv-parser branch 12 times, most recently from 0ad0e84 to 208743b Compare March 9, 2023 15:58
@gethvi gethvi marked this pull request as ready for review March 9, 2023 16:05
@gethvi gethvi changed the title from "FIX: Fixes DateTime time conversions for CSV parser" to "FIX+ENH: Time conversions for CSV parser and HTML table parser" Mar 9, 2023
Contributor Author

gethvi commented Mar 9, 2023

Ready for review.

@gethvi gethvi force-pushed the fix-csv-parser branch 4 times, most recently from 0c0c837 to d841782 Compare March 13, 2023 15:14
@gethvi gethvi force-pushed the fix-csv-parser branch 2 times, most recently from fc197fa to 9f14b68 Compare March 16, 2023 11:36
Contributor

@kamil-certat kamil-certat left a comment

I like the idea :) I have a few small questions about implementation details whose purpose I'm unsure of.

Six resolved review threads on intelmq/lib/datatypes.py, four of them outdated.
@gethvi gethvi force-pushed the fix-csv-parser branch 2 times, most recently from b7c8119 to 11fd27f Compare May 17, 2023 09:28
@gethvi gethvi force-pushed the fix-csv-parser branch 2 times, most recently from 7e9c755 to b940af5 Compare May 17, 2023 09:34
@sebix sebix added this to the 3.1.1 milestone May 19, 2023
@sebix sebix merged commit 3be2460 into certtools:develop May 19, 2023
24 checks passed
@gethvi gethvi deleted the fix-csv-parser branch March 1, 2024 11:28
3 participants